Despite enormous efforts, our understanding of the structure and dynamics of α-synuclein
(ASN), a disordered protein that plays a key role in neurodegenerative disease, is far from
complete. In order to better understand sequence-structure-property relationships in
α-synuclein, we have developed a coarse-grained model using knowledge-based residue-residue
interactions and used it to study the structure of free ASN as a function of temperature (T)
with a large-scale Monte Carlo simulation.
Snapshots of the simulation and contour contact maps show changes in structure formation due
to self-assembly as a function of temperature. Variations in the residue mobility profiles
reveal a clear distinction among three segments along the protein sequence. The N-terminal
(1-60) and C-terminal (96-140) regions contain the least mobile residues, which are separated
by the higher-mobility non-amyloid component (NAC) (61-95). Our analysis of the intra-protein
contact profile shows a higher frequency of residue aggregation (clumping) in the N-terminal
region relative to that in the C-terminal region, with little or no aggregation in the NAC
region. The radius of gyration (Rg) of ASN decays monotonically with decreasing temperature,
consistent with the finding of Allison et al. (JACS, 2009). Our analysis of the structure
function provides an insight into the mass (N) distribution of ASN and the dimensionality (D)
of the structure as a function of temperature. We find that the structure is globular with
D ≈ 3 at low T, a random coil with D ≈ 2 at high T, and in between (2 < D < 3) at
intermediate temperatures. The magnitudes of D are in agreement with experimental estimates
(J. Biological Chem 2002). © 2015 Author(s). All article content, except where otherwise
noted, is licensed under a Creative Commons Attribution 3.0 Unported License.
[http://dx.doi.org/10.1063/1.4927544]


I.  INTRODUCTION 


α-Synuclein (ASN)1 is a 140 amino acid protein that is abundant in neurons and shows
extensive interactions with the phospholipid membrane and other proteins. ASN has been
identified as a critical component in the onset of neurodegenerative diseases
(synucleinopathies2), including Parkinson’s disease (PD), Lewy body dementia and Alzheimer’s
disease. ASN13 has been extensively studied as an intrinsically disordered (unstructured)
protein with a monomeric conformation comparable to a random coil. The self-association and
toxic clumping of ASN into amyloid fibrils containing secondary structures is one of the
prominent pathological characteristics that lead to PD.4,5 Recent studies6-11 have shown,
however, that ASN can assume a number of structures involving α-helices, β-sheets, trimers
and tetramers that resist aggregation. The primary structure of the 140-residue ASN1 consists
of three domains: (i) an N-terminal region (residues 1-60) with the propensity to form an
alpha helix on membrane binding, (ii) a central region (residues 61-95) with a non-amyloid
component (NAC), and (iii) an acidic C-terminal region (residues 96-140).

ASN has been extensively studied, and the contradictory results6-11 on the structure of ASN
have continued to attract enormous interest in this field.12-21 For example, Mysling et al.22
have studied the backbone dynamics of soluble ASN oligomers using hydrogen/deuterium exchange
and found that the C-terminal (residues 94-140) and N-terminal (residues 4-17) domains are
very mobile. Gurry et al.23 used NMR and SAXS methods to investigate the ensemble structures.
They23 find that a large fraction of the ensemble is a disordered monomer, with a small
fraction of trimeric and tetrameric oligomers.

Enormous efforts24-54 have been made in understanding the structure and dynamics of ASN using
NMR data, molecular dynamics (MD), and Monte Carlo simulations. These studies enhance our
understanding of sequence-function correlations, but also reveal significant gaps in our
knowledge. Coskuner and Wise-Scira30 have acknowledged the ‘valuable insight’ gained from the
experimental studies and pointed out that the ‘atomic level information with dynamics can be
gained from theoretical studies of ASN and its mutation at the monomeric level in solution
that are not easily observable using conventional experimental tools.’ They find that the
A53T mutant-type ASN structures are thermodynamically more stable than those of the wild-type
protein in aqueous solution, with a higher propensity to aggregate due to increased β-sheet
formation and lack of ‘strong intramolecular long-range interaction.’ Jonsson et al.34 have
carried out Monte Carlo studies of free ASN and identified both a disordered phase and a
phase stabilized by β-strand formation. The importance of all-atom MD simulations31-33 and
their ability ‘to capture experimentally observed features’ have also been reported.34 These
studies have shown that ‘it remains a challenge to explore the full conformational ensemble
populated by a flexible protein of this length’ and justified ‘Monte Carlo (MC) rather than
MD methods’ involving efficient global moves, e.g., ‘pivot update’. Global moves such as
‘pivot update’33, adopted and emphasized appropriately in this study,34 appear ‘much more
efficient, compared to “small steps” algorithms like MD’. The efficiencies and pitfalls of
both MC and MD have been extensively explored in modeling polymers.36 In this study we
implement a well-tested and efficient procedure, the bond fluctuation scheme,36,37 in
modeling the structure and dynamics of un-solvated ASN.

It is not computationally feasible to incorporate all atomic-scale details to explore the
complete conformational phase space of a protein as large as ASN using the force fields
generally adopted in MD simulations. Coarse-grained methods have been used to carry out
large-scale computer simulations and draw meaningful conclusions about the
sequence-structure-function relationships. Devising interaction potentials, exploring the
phase space selectively, and resorting to efficient and effective methods are common
procedures in coarse-grained modeling.38-49 Knowledge-based contact matrices50-57 (derived
from an ensemble of frozen protein structures available at the Protein Data Bank (PDB)) have
been extensively used to develop phenomenological residue-residue interactions to understand
the folding dynamics of proteins. As in our previous investigations,56,57 we use the
classical knowledge-based interaction due to Miyazawa and Jernigan (MJ)51 and one of its
improved versions by Betancourt and Thirumalai (BT)53 to study the structure and dynamics of
un-solvated ASN as a function of temperature.


II.  MODEL  AND  METHOD 

In our coarse-grained description,56,57 ASN is represented by a chain of 140 nodes tethered
together by fluctuating bonds, (1M)-(2D)-(3V)- ... -(70V)- ... -(140A), where each node
represents an amino acid. The intra-molecular details of the amino acids are thus ignored,
but the specificity of each residue is captured via its unique interaction energy. The
protein chain is placed on a cubic lattice in a random configuration at the start of the
simulation, and the bond length between consecutive nodes varies between 2 and √10 in units
of the lattice constant.36 Despite the simple lattice grid, this approach provides ample
degrees of freedom for each residue to move and for peptide bonds to fluctuate, much more
than the fixed bond length frequently used in lattice simulations.36 Small-step (one lattice
constant) moves retain some of the small-scale details, which may be missed in pivot updates
and other arbitrary moves.36 Because of its efficiency and effectiveness, such a
bond-fluctuating mechanism has become a common tool in computer simulation modeling of
complex systems, as is the case for homopolymers,36 proteins,56,57 membranes,58 and
bio-functionalized nano assemblies.59,60 Each residue interacts56,57 with the neighboring
residues within a range (r_c) with a generalized Lennard-Jones potential,
where r_ij is the distance between the residues at sites i and j; r_c = √8 and σ = 1 in units
of the lattice constant. The potential strength, ε_ij, is unique for each interaction pair,
with appropriate positive (repulsive) and negative (attractive) values taken from the
knowledge-based contact interactions MJ51 and BT.53 The number of interacting lattice sites
(within the range of the interaction) of a residue is relatively large (on the order of a
hundred). Because of the efficiency of the approach with the fluctuating covalent bond, it is
easier to explore the huge conformational space while incorporating ample degrees of
freedom.36,37
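As a hedged illustration of the pairwise energy described above, the sketch below evaluates one generalized Lennard-Jones form, U_ij = |ε_ij|(σ/r_ij)^12 + ε_ij(σ/r_ij)^6 for r_ij < r_c. This specific functional form and the strict cutoff are assumptions made for illustration (the text states only that the potential is a generalized Lennard-Jones potential with r_c = √8 and σ = 1). The function name `pair_energy` and the example ε values are hypothetical and are not taken from the MJ or BT tables.

```python
import math

R_CUT = math.sqrt(8.0)  # interaction range r_c in lattice units (from the text)
SIGMA = 1.0             # sigma in lattice units (from the text)


def pair_energy(r_ij, eps_ij):
    """Assumed generalized Lennard-Jones energy between residues i and j.

    U = |eps| * (sigma / r)^12 + eps * (sigma / r)^6 for r < r_c, else 0.
    A negative eps_ij gives an attractive r^-6 term; a positive eps_ij
    makes both terms repulsive.
    """
    if r_ij <= 0.0:
        raise ValueError("distance must be positive")
    if r_ij >= R_CUT:  # assumption: the cutoff itself is excluded
        return 0.0
    s6 = (SIGMA / r_ij) ** 6
    return abs(eps_ij) * s6 * s6 + eps_ij * s6


if __name__ == "__main__":
    # Hypothetical contact energies, for illustration only.
    for eps in (-0.5, 0.5):
        energies = [pair_energy(r, eps) for r in (2.0, math.sqrt(6.0), 3.0)]
        print(eps, energies)
```

With this form the sign of ε_ij alone decides whether a pair attracts or repels at contact distances, which matches the description of positive (repulsive) and negative (attractive) matrix elements.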

Each tethered residue performs its stochastic movements with the Metropolis algorithm,
briefly described as follows. A residue at a site i is selected randomly to move to a
neighboring lattice site j. The excluded volume constraints are then checked, including the
limits on the covalent bond length that would result from the proposed random move. If these
are satisfied, the residue is moved from site i to site j with the Boltzmann probability
exp(-ΔE_ij/T), where ΔE_ij = E_j - E_i is the change in energy between its new (E_j) and old
(E_i) configurations; T is the temperature in reduced units of the Boltzmann constant and the
energy (ε_ij). An attempt to move each residue once defines the unit Monte Carlo step
(MCS).35 We monitor a number of local and global physical quantities during the course of the
simulation, including the energy of each residue, its mobility, the mean square displacement
of the center of mass of the protein, the radius of gyration, and the structure factor.
Simulations are performed at each temperature for a sufficiently long time (typically ten
million time steps) with many independent samples (typically 100) to estimate the average
values of these quantities. We have used a 64^3 lattice to generate all the data presented
here, although different lattice sizes are also used to verify that our findings are
qualitatively independent of the finite size.
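The acceptance rule above can be sketched as follows. This is a minimal illustration under stated assumptions, not the simulation code used in this work: the helper names `bond_ok` and `metropolis_accept` are hypothetical, the bond check enforces only the stated limits of 2 and √10 (any further restriction on allowed bond vectors is not reproduced), and excluded-volume checks are omitted.

```python
import math
import random


def bond_ok(site_a, site_b, min_sq=4, max_sq=10):
    """True if the bond length between two lattice sites lies between
    2 and sqrt(10), checked on squared integer distances."""
    d2 = sum((x - y) ** 2 for x, y in zip(site_a, site_b))
    return min_sq <= d2 <= max_sq


def metropolis_accept(delta_e, temperature, rng=random):
    """Accept a proposed move with probability min(1, exp(-delta_e / T)).

    delta_e is E_new - E_old in reduced units; temperature is the
    reduced temperature T.
    """
    if delta_e <= 0.0:
        return True  # downhill or neutral moves are always accepted
    return rng.random() < math.exp(-delta_e / temperature)


if __name__ == "__main__":
    rng = random.Random(1)
    t = 0.030  # one of the reduced temperatures used in the text
    trials = 10000
    accepted = sum(metropolis_accept(0.02, t, rng) for _ in range(trials))
    print("acceptance fraction for delta_e = 0.02:", accepted / trials)
```

One Monte Carlo step in this picture is a sweep of such attempts, one per residue, with the geometric checks applied before the energy test.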


III.  RESULTS  AND  DISCUSSION 

All physical quantities and variables for ASN are presented in arbitrary (reduced) units as
noted above. The simulation temperature is varied to assess the variations in these physical
quantities, and we focus on the range where most changes in these quantities occur, avoiding
the low- and high-temperature extremes.

Typical snapshots from the simulations as a function of temperature are presented in
Figure 1. Although a snapshot does not provide a comprehensive summary of the average
ensemble behavior (involving millions of configurations), it provides a glimpse into some of
the characteristics. At low temperatures, globular structures appear on multiple scales, from
segmental to overall global. Raising the temperature opens up the compact structures,
resulting in loop formation, and fibrous structures prevail at high temperatures, where
localized residue aggregation and small loops persist.

FIG. 1. Snapshots of protein configurations at the end of 10 million time steps at
temperatures T = 0.026, 0.028, 0.030, 0.032 (from left to right). The nodes (residues) close
enough to experience interactions are shown by spheres.

Contour maps of the snapshot configurations are presented in Figure 2 for a representative
set of temperatures ranging from 0.026 to 0.032. These results show a systematic reduction in
looping with increasing temperature. Although it is difficult to compare these trends
quantitatively with the results from different models, the general features in the
distribution of contact loops (at T = 0.030) appear similar to the experimental results of
Dedmon et al. (Figure 2).31

FIG. 2. Residue map (neighboring residues along the contour within the range of interaction)
of protein configurations at the end of 10^7 time steps with 40 independent runs at
temperatures T = 0.026, 0.028, 0.030, and 0.032 with BT potential.53

The mobility of each ASN residue in the simulation as a function of temperature can be used
to identify the more mobile segments. The average residue mobility (Mn) is defined as the
fraction of its successful moves per unit time step, and Figure 3 shows the mobility profile
of ASN at temperatures T = 0.026, 0.028, 0.030, and 0.032. The least mobile residues at
T = 0.026 include 13E, 19A, 21K, 23K, 28E, 32K, 34K, 35E, 43K, 45K, 46E, 56A, 57E, 58K, 60K
in the N-terminal region and 96K, 97K, 102K, 104E, 105E, 129S, 131E in the C-terminal region.
The intermediate region (61-95), comprising the non-amyloid component (NAC), is more mobile
than the N-terminal and C-terminal regions. The least mobile residues are predominantly E, K,
and A, and such rigidity may provide seeds for aggregation. Aggregation of the N-terminal
region of the protein may be enhanced by the close proximity of attractive residues compared
to the C-terminal region. Apart from the steric constraints imposed by peptide bonds, the
residue mobility depends on the local structure. The data show that looping may involve
residues that are well separated in the linear sequence, especially at low temperatures
(Figure 2). Raising the temperature enhances the mobility, and at the highest temperature
(T = 0.032) all residues become highly mobile as the residue-residue interactions become less
important and the protein assumes a self-avoiding walk (SAW) or random coil conformation
(vide infra).
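The mobility defined above (fraction of successful moves per attempted move) can be tallied per residue as in the following sketch. The function names and the toy counters are hypothetical and serve only to illustrate the bookkeeping; they are not data from the simulations.

```python
def mobility_profile(accepted, attempted):
    """Fraction of successful moves per attempted move, one value per residue."""
    if len(accepted) != len(attempted):
        raise ValueError("counter lists must have the same length")
    return [a / n if n > 0 else 0.0 for a, n in zip(accepted, attempted)]


def least_mobile(profile, count):
    """Return the 1-based residue numbers of the `count` smallest mobilities."""
    order = sorted(range(len(profile)), key=lambda i: profile[i])
    return sorted(i + 1 for i in order[:count])


if __name__ == "__main__":
    # Toy counters for a four-residue chain (illustrative values only).
    accepted = [120, 15, 300, 8]
    attempted = [1000, 1000, 1000, 1000]
    profile = mobility_profile(accepted, attempted)
    print(profile)
    print(least_mobile(profile, 2))
```

In a full run the counters would be accumulated over all time steps and independent samples before the ratio is taken.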

The simulations show that the local structure and mobility are correlated, especially at low
temperatures. We quantify the local structure by considering the average number (Nn) of
surrounding residues within the range of interaction as a function of temperature, as shown
in Figure 4.

FIG. 3. Average mobility (fraction of successful moves per unit time step) of each residue of
ASN protein at temperatures T = 0.026-0.032. Simulations are performed on a 64^3 lattice for
10^7 time steps with 100 independent samples at each temperature with BT potential;53 the
estimates of mobility involve about a billion moves for each residue.


[Plot omitted; horizontal axis: Residue (BT), 0-140]


FIG. 4. Average number (Nn) of surrounding residues within the range of interaction of each
residue of ASN protein at temperatures T = 0.026-0.032. Statistics are the same as in
Figure 3.

The peaks observed at lower temperatures (T = 0.026, 0.028) are signatures of intra-chain
self-organization (either protein folding or aggregation) of the underlying residues. The
largest fractions of interacting residues are located in the N-terminal region (1-61) and in
the C-terminal region (96-140). Lack of organization in the NAC region (61-95) makes it
possible for ASN to assemble in a fibrous structure. Structure formation involves 10K, 12K,
13E, 20E, 21K, 23K, 21 A, 32K, 34K, 35E, 43K, 45K, 46E, 57E, 58K, 60K, 61E in the N-terminal
region and 96K, 97K, 102K, 104E, 105E, 126E, 129S, 130E, 131E, 137E in the C-terminal region.
The residue-residue interactions in the N-terminal and C-terminal regions of the protein
suggest the important role the N- and C-terminals play in controlling the multi-scale
structure of the protein. The formation of ASN fibers has been associated with
intermolecular β-sheet formation, which cannot be directly identified from our coarse-grained
model. However, we note that the spacing and frequency of the interacting residues (primarily
due to the location of E and K residues) are consistent with the general structural patterns
reported in Ref. 34.
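The contact profile Nn discussed above can be computed from a single lattice configuration as in the sketch below. It assumes that every other residue closer than r_c = √8 counts as a neighbor, bonded or not, and that the cutoff distance itself is excluded; both choices are assumptions made here for illustration, and the function names are hypothetical.

```python
def neighbor_counts(positions, r_cut_sq=8):
    """Number of other residues within the interaction range of each residue.

    positions: list of integer (x, y, z) lattice sites, one per residue.
    r_cut_sq:  squared cutoff, 8 for r_c = sqrt(8) in lattice units.
    """
    n = len(positions)
    counts = [0] * n
    for i in range(n):
        for j in range(i + 1, n):
            d2 = sum((a - b) ** 2 for a, b in zip(positions[i], positions[j]))
            if d2 < r_cut_sq:  # assumption: strict cutoff
                counts[i] += 1
                counts[j] += 1
    return counts


def average_profile(count_lists):
    """Average the per-residue counts over several configurations."""
    m = len(count_lists)
    return [sum(column) / m for column in zip(*count_lists)]


if __name__ == "__main__":
    config = [(0, 0, 0), (2, 0, 0), (4, 1, 0), (10, 0, 0)]
    print(neighbor_counts(config))
```

Averaging such counts over the stored configurations and independent samples gives a profile of the kind plotted against residue number.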

Stochastic movement of residues and their transient settling lead to global motion of the
protein that depends on the temperature. The global motion can be examined by analyzing the
variation of the root mean square (RMS) displacement of the center of mass of the protein
(Rc) as a function of time (t). Figure 5 shows the RMS displacement Rc as a function of time
over a range of temperatures (T = 0.026-0.034). The asymptotic variation shows a power-law
dependence of the RMS displacement on the time step (t), with Rc ∝ t^ν. The power-law
exponent ν provides an insight into the type of motion, including diffusion (ν = 1/2),
sub-diffusion (ν < 1/2), and drift dynamics (ν = 1). Figure 5 shows a systematic change in
the magnitude of the exponent from diffusive (ν = 1/2) motion at high temperature
(T = 0.034) to sub-diffusive (ν < 1/2) dynamics at low temperatures. The power-law exponent ν
is so low at the lower temperatures (T = 0.026, 0.027) that the protein is essentially
stationary in our simulations. We have also analyzed the RMS displacement of the center node
(70V), which follows the global dynamics of the protein at long times.
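The exponent ν in Rc ∝ t^ν can be estimated from the asymptotic part of the data by a straight-line fit on a log-log scale. The sketch below is one simple way to do this, assuming an ordinary least-squares fit over the last portion of the time series; the function name, the `tail_fraction` parameter, and the synthetic data are hypothetical.

```python
import math


def fit_power_law_exponent(times, displacements, tail_fraction=0.5):
    """Least-squares slope of log(R_c) versus log(t) over the tail of the data."""
    n = len(times)
    keep = max(2, round(n * tail_fraction))  # number of late-time points used
    xs = [math.log(t) for t in times[n - keep:]]
    ys = [math.log(r) for r in displacements[n - keep:]]
    mean_x = sum(xs) / keep
    mean_y = sum(ys) / keep
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    return sxy / sxx


if __name__ == "__main__":
    # Synthetic diffusive data, R_c = 0.3 * t^(1/2), for illustration only.
    times = [10.0 ** k for k in range(1, 8)]
    rms = [0.3 * math.sqrt(t) for t in times]
    print(fit_power_law_exponent(times, rms))
```

A fitted slope near 1/2 indicates diffusion, a smaller slope sub-diffusion, and a slope near 1 drift.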

FIG. 5. Variation of the root mean square (RMS) displacement of the center of mass of the
protein with the time steps. The numbers along the final data points are the values of the
exponent ν in Rc ∝ t^ν (ν = 1/2 shows diffusion, ν < 1/2 is sub-diffusion). Simulations are
performed on a 64^3 lattice for 10^7 time steps with 100-1000 independent samples at each
temperature with BT potential.53

FIG. 6. Variation of the equilibrium average radius of gyration versus temperature with two
knowledge-based potentials, classic MJ51 and improved BT53 potentials. Inset is the data from
Allison et al.32 Estimates of Rg involve an average over the time steps (last one-third to
one-half, i.e., the data from the last 2.5 × 10^6 to 1.6 × 10^6 MCS time) and 100 independent
samples, each on a 64^3 lattice.

The protein size depends on the residue-residue interactions and the temperature, and is
characterized by the radius of gyration (Rg). We have analyzed the radius of gyration in
detail as a function of temperature as the protein relaxes and reaches its equilibrium. Rg
was calculated from the last half to one-third of the total time steps (i.e., the last
2.5 × 10^6 to 1.6 × 10^6 MCS time) and was averaged over 100 independent samples.
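The averaging procedure above can be sketched as follows: compute Rg for each stored configuration, then average over the late part of the time series. The function names and the toy configuration are hypothetical, and equal mass is assumed for every node, which is an assumption of this sketch.

```python
import math


def radius_of_gyration(positions):
    """Root mean square distance of the nodes from their center of mass,
    assuming equal mass for every node."""
    n = len(positions)
    center = [sum(p[k] for p in positions) / n for k in range(3)]
    rg_sq = sum(
        sum((p[k] - center[k]) ** 2 for k in range(3)) for p in positions
    ) / n
    return math.sqrt(rg_sq)


def tail_average(series, fraction=0.5):
    """Average of the last `fraction` of a time series (equilibrated part)."""
    keep = max(1, round(len(series) * fraction))
    tail = series[len(series) - keep:]
    return sum(tail) / len(tail)


if __name__ == "__main__":
    # Eight nodes on the corners of a cube of side 2 (illustrative only).
    cube = [(x, y, z) for x in (0, 2) for y in (0, 2) for z in (0, 2)]
    print(radius_of_gyration(cube))
    print(tail_average([5.0, 4.0, 3.2, 3.0, 3.1, 2.9], fraction=0.5))
```

The sample average is then a plain mean of the tail averages from the independent runs.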

Figure 6 shows the variation of the equilibrium radius of gyration as a function of
temperature simulated using two knowledge-based residue-residue interaction potentials
(BT, MJ).51,53 Differences in the BT and MJ potentials lead to a different temperature range
for the unfolding transition. While quantitatively different, the variations remain similar
qualitatively. A data set from Table 3 of Allison et al.,32 which may be the estimate of Rg
in a different solvent medium, is also included for comparison. This result shows that the
qualitative variation of Rg with the temperature is similar to that reported by Allison et
al.32

The structure factor, S(q), of the protein provides an insight into its multi-scale
structures. We have studied S(q) as the structure evolves with the temperature,

\[
S(q) = \left\langle \frac{1}{N} \left| \sum_{j=1}^{N} e^{-i \vec{q} \cdot \vec{r}_j} \right|^{2} \right\rangle_{|\vec{q}|} \tag{2}
\]

where r_j is the position of each residue and |q| = 2π/λ is the wave vector of wavelength λ.
From the power-law scaling of the structure factor with the wave vector, S(q) ∝ q^(-1/γ), one
can estimate the spatial distribution of residues in the protein by analyzing its radius of
gyration (Rg). The scaling of the radius of gyration of the protein chain with the number N
of its nodes (residues), i.e., Rg ∝ N^γ, provides an insight into the shape of the chain and
allows us to distinguish between random coil and globular protein conformations. For example,
γ = 1/2 represents a random-coil conformation of the protein. Conversely, one can also
estimate the effective dimension (D) of the residue distribution within the radius (Rg) of
the protein, i.e., N ∝ Rg^D, D = 1/γ. Estimates of these exponents for the shape and mass
distribution (γ, D) of a protein require evaluation of Rg for a number of different N.
Unfortunately, we have only a fixed number (N) of residues in a protein; therefore, scaling
of Rg with N is not an option for evaluating the mass distribution (i.e., structure) of the
protein. However, we can estimate the exponents of the mass distribution of the protein by
analyzing the structure factor over almost all length scales, including local segments.

FIG. 7. Variation of the structure factor with the wave vector q for a range of temperatures
with BT potential.53 Slopes of the fitted data points at three representative temperatures
are included to show the changes. Insets are the variation of the corresponding radius of
gyration with the temperature and that of the wave vector q with the spatial (r) scale. A set
of data with only excluded volume interaction (ε = 0) is also included for reference, and the
theoretical analysis presented in Ref. 31 should be comparable to this reference set.
Simulations are performed on a 64^3 lattice for 10^7 time steps with 100 independent samples
at each temperature.

Figure 7 shows the variation of S(q) with the wave vector q for the BT potential. By fitting
the data points at length scales comparable to the protein size (r ≈ Rg) at appropriate
temperatures, we evaluate the effective dimension of ASN. Our data clearly show a globular
structure (D ≈ 3) at the low temperature T = 0.026 and a random coil (D ≈ 2) at the high
temperature T = 0.032; D falls below 2 if the fit is shifted towards lower q, which is closer
to a self-avoiding walk than to a random walk. These estimates of γ (γ = 1/D) are consistent
with the observations made by Uversky et al. (see their equations 2 and 3).6 In the absence
of residue-residue interaction (ε = 0), the above scaling gives D ≈ 1.7 with the
corresponding SAW exponent γ = 0.59. Thus the structure function provides the overall shape
and size of ASN over the range of temperatures, from low to high.
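A minimal sketch of this analysis is given below, assuming that S(q) is averaged over a supplied set of wave vectors of equal magnitude and that D is read off as minus the log-log slope of S(q) versus q (since S(q) ∝ q^(-1/γ) and D = 1/γ). The function names, the choice of wave vectors, and the synthetic data are hypothetical.

```python
import cmath
import math


def structure_factor(positions, q_vectors):
    """S(q) = < |sum_j exp(-i q . r_j)|^2 / N >, averaged over q_vectors."""
    n = len(positions)
    total = 0.0
    for q in q_vectors:
        amplitude = sum(
            cmath.exp(-1j * sum(qk * rk for qk, rk in zip(q, r)))
            for r in positions
        )
        total += abs(amplitude) ** 2 / n
    return total / len(q_vectors)


def effective_dimension(q_values, s_values):
    """D from S(q) ~ q^(-D): minus the least-squares slope of log S vs log q."""
    xs = [math.log(q) for q in q_values]
    ys = [math.log(s) for s in s_values]
    mean_x = sum(xs) / len(xs)
    mean_y = sum(ys) / len(ys)
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    return -sxy / sxx


if __name__ == "__main__":
    # Synthetic random-coil-like scaling, S(q) = q^-2, for illustration only.
    qs = [0.1, 0.2, 0.4, 0.8]
    print(effective_dimension(qs, [q ** -2.0 for q in qs]))
    # gamma = 1 / D; for example, D = 1.7 corresponds to gamma of about 0.59.
    print(1.0 / 1.7)
```

In practice the fit would be restricted to the q range that corresponds to length scales near Rg, as described in the text.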


IV.  CONCLUSIONS 

A coarse-grained simulation55,56 with knowledge-based interaction potentials50,52 is used to
investigate the structure of the intrinsically disordered protein ASN as a function of
temperature. In their recent study, Jonsson et al.34 have noted that MC simulations are more
effective for exploring the phase space of the conformational ensemble. We have implemented
an efficient and effective Monte Carlo scheme with phenomenological residue-residue
interactions in order to capture the large-scale (both length and temporal) features without
sacrificing small-scale resolution. The coarse-grained study of free ASN presented here
complements the extensive atomic-scale studies while allowing us to explore a larger phase
space. We are able to analyze a number of local and global physical quantities and identify
the trends in the structural evolution of ASN as a function of temperature. Even though the
reduced units (length and time) are arbitrary, these simulations provide insight into the
trends in the structure and dynamics as a function of temperature.

The contour maps of residue-residue interactions from the MC simulations provide an insight
into the structure as a function of temperature, and the mobility profiles allow us to
identify the most active residues promoting intra- and intermolecular interactions. We find
that different segments of the protein, i.e., the N-terminal (1-60), C-terminal (96-140), and
NAC (61-95) regions, can be clearly distinguished based on the mobility profile, and we are
able to identify the least mobile residues in the N-terminal and C-terminal regions around
which the local structures may be organized.

We find that the residue-residue contacts that may lead to higher-order multimer structures
are present at a higher level in the N-terminal region compared to the C-terminal region, and
little globular structure is observed in the NAC region. In coordination with the mobility
profile, the intra-protein structural profile provides us with a mechanism to identify the
residues that act as seeds to form local clumps in the N-terminal and C-terminal regions of
ASN. To our knowledge, identification of specific residues with the least mobility and those
that can act as seeds for local aggregation has not been reported in atomic-scale
simulations.

Analysis of the global physical quantities (RMS displacements of ASN, its radius of gyration,
and structure factor) provides insight into (i) the relaxation and characteristic dynamics,
(ii) variation of the overall shape, and (iii) its scaling with size (distribution of
residues) as a function of temperature. We are able to predict the diffusive and
sub-diffusive nature of the global dynamics on reducing the temperature from high to low
values. Such characteristics will help in understanding how fast ASN can respond at different
temperatures. We find that the radius of gyration of free ASN decreases continuously on
reducing the temperature, similar to findings reported by Allison et al.32 Additionally, we
are able to identify the random coil conformation at high temperature and globular structures
at lower temperatures, which helps with understanding how the residues distribute at
different temperatures. The scaling of the size of ASN with its molecular weight is
consistent with the findings reported by Uversky et al.6 ASN has been extensively studied
both experimentally and computationally, with growing interest in uncovering its unknown
characteristics that may be related to function and disease states. We hope our findings on
the structural response of free ASN to temperature complement the current simulations and
provide additional tools for the interpretation of laboratory observations. There are many
parameters, such as the effects of solvent and the underlying matrix in in vivo and in vitro
scenarios, including protein concentration,61 that we will explore in our future efforts.